The Chaotic Nature of Faster Gradient Descent Methods

Authors

  • Kees van den Doel
  • Uri M. Ascher
Abstract

The steepest descent method for large linear systems is well known to often converge very slowly, with the number of iterations required being about the same as that obtained by a gradient descent method with the best constant step size, and growing proportionally to the condition number. Faster gradient descent methods must occasionally resort to significantly larger step sizes, which in turn yields a rather non-monotone decrease pattern in the residual vector norm. We show that such faster gradient descent methods in fact generate chaotic dynamical systems for the normalized residual vectors. Very little is required to generate chaos here: simply damping steepest descent by a constant factor close to 1 will do. Several variants of the family of faster gradient descent methods are investigated, both experimentally and analytically. The fastest practical methods of this family in general appear to be the known, chaotic, two-step ones. Our results also highlight the need for better theory for existing faster gradient descent methods.

1 Faster gradient descent methods

Many efforts have been devoted, in the two decades that have passed since the pioneering paper of Barzilai & Borwein [4], to the design, analysis, extension and application of faster gradient descent methods for function minimization; see, e.g., [13, 26, 6, 12] and references therein. These are methods that converge significantly faster than the method of steepest descent although, unlike the conjugate gradients (CG) method, they confine their search directions to the gradient vector at each iteration.

∗ Department of Computer Science, University of British Columbia, Canada (kvdoel/[email protected]); supported in part by NSERC Discovery Grant 84306.
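To make the behavior discussed above concrete, the following is a minimal sketch (not the authors' code; the SPD test matrix, its spectrum, and the iteration count are illustrative assumptions) comparing classical steepest descent with a lagged-step-size gradient descent for a linear system Ax = b. On a convex quadratic, reusing the previous steepest descent step size coincides with a Barzilai-Borwein step.

```python
# Sketch: steepest descent vs. a "faster" lagged-step gradient descent
# on a random SPD system A x = b, tracking residual norms.
import numpy as np

rng = np.random.default_rng(0)
n = 100
eigs = np.logspace(0, 3, n)                      # eigenvalues 1 .. 1e3 (illustrative conditioning)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigs) @ Q.T                      # SPD matrix with known spectrum
b = rng.standard_normal(n)

def gradient_descent(lagged=False, iters=300):
    """Minimize f(x) = 0.5 x'Ax - b'x; the residual r = b - Ax is minus the gradient."""
    x = np.zeros(n)
    r = b - A @ x
    alpha_prev = None
    norms = []
    for _ in range(iters):
        Ar = A @ r
        alpha_sd = (r @ r) / (r @ Ar)            # exact (Cauchy) steepest descent step
        # "Faster" variant: reuse the steepest descent step size from the
        # previous iteration (a Barzilai-Borwein step on this quadratic).
        alpha = alpha_prev if (lagged and alpha_prev is not None) else alpha_sd
        x = x + alpha * r
        r = b - A @ x
        alpha_prev = alpha_sd
        norms.append(np.linalg.norm(r))
    return norms

sd = gradient_descent(lagged=False)
fast = gradient_descent(lagged=True)
print("steepest descent, final residual:   ", sd[-1])
print("lagged-step descent, final residual:", fast[-1])
```

Running this typically shows the lagged-step variant reaching a far smaller residual in the same number of iterations, but with a markedly non-monotone, erratic residual history, consistent with the behavior described in the abstract.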


Similar articles

A Novel Fuzzy Based Method for Heart Rate Variability Prediction

In this paper, a novel fuzzy-based technique is presented for chaotic nonlinear time series prediction. A fuzzy approach combined with a gradient learning algorithm constitutes the main component of this method. The learning process is similar to the conventional gradient descent learning process, except that the input patterns and parameters are stored in mem...


Extensions of the Hestenes-Stiefel and Polak-Ribiere-Polyak conjugate gradient methods with sufficient descent property

Using search directions of a recent class of three-term conjugate gradient methods, modified versions of the Hestenes-Stiefel and Polak-Ribiere-Polyak methods are proposed which satisfy the sufficient descent condition. The methods are shown to be globally convergent when the line search fulfills the (strong) Wolfe conditions. Numerical experiments are done on a set of CUTEr unconstrained opti...
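For reference, the conditions mentioned in this abstract are standard in the line-search literature (the constants c, c1, c2 below are generic, not taken from the cited paper). The sufficient descent condition on the search direction d_k requires
\[
g_k^{\top} d_k \le -c\,\|g_k\|^2 \quad \text{for some } c > 0 \text{ and all } k,
\]
while the strong Wolfe conditions on the step size \(\alpha_k\) along \(d_k\) read
\[
f(x_k + \alpha_k d_k) \le f(x_k) + c_1 \alpha_k\, g_k^{\top} d_k,
\qquad
\lvert \nabla f(x_k + \alpha_k d_k)^{\top} d_k \rvert \le c_2\, \lvert g_k^{\top} d_k \rvert,
\qquad 0 < c_1 < c_2 < 1.
\]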


Conjugate gradient neural network in prediction of clay behavior and parameters sensitivities

The use of artificial neural networks has increased in many areas of engineering. In particular, this method has been applied to many geotechnical engineering problems and demonstrated some degree of success. A review of the literature reveals that it has been used successfully in modeling soil behavior, site characterization, earth retaining structures, settlement of structures, slope stabilit...


An eigenvalue study on the sufficient descent property of a modified Polak-Ribière-Polyak conjugate gradient method

Based on an eigenvalue analysis, a new proof for the sufficient descent property of the modified Polak-Ribière-Polyak conjugate gradient method proposed by Yu et al. is presented.


Faster gradient descent and the efficient recovery of images

Much recent attention has been devoted to gradient descent algorithms where the steepest descent step size is replaced by a similar one from a previous iteration or gets updated only once every second step, thus forming a faster gradient descent method. For unconstrained convex quadratic optimization these methods can converge much faster than steepest descent. But the context of interest here ...
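As an illustration of the second variant mentioned here, a simplified sketch (the setup of the SPD system A, b can follow the earlier example; names and defaults are illustrative) recomputes the steepest descent step size only once every `cycle` iterations and reuses it in between:

```python
import numpy as np

def cyclic_steepest_descent(A, b, iters=300, cycle=2):
    """Steepest descent where the Cauchy step size is refreshed only once
    per `cycle` iterations and reused in between (illustrative sketch)."""
    x = np.zeros(b.size)
    r = b - A @ x                                # residual; equals minus the gradient
    alpha = None
    norms = []
    for k in range(iters):
        if alpha is None or k % cycle == 0:
            alpha = (r @ r) / (r @ (A @ r))      # refresh the exact steepest descent step
        x = x + alpha * r                        # reuse the stored step size
        r = b - A @ x
        norms.append(np.linalg.norm(r))
    return norms
```

As with the lagged variant sketched earlier, the residual norms typically decrease much faster than for plain steepest descent, though not monotonically.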



Journal:
  • J. Sci. Comput.

Volume: 51, Issue: -

Pages: -

Publication year: 2012